TUHOI: Trento Universal Human Object Interaction Dataset
Abstract
This paper describes the Trento Universal Human Object Interaction dataset (TUHOI), which is dedicated to human-object interactions in images. Recognizing human actions is an important yet challenging task. Most available datasets in this field cover only a limited number of actions and objects. A large dataset with varied actions and human-object interactions is needed for training and evaluating complex, robust human action recognition systems, especially systems that combine knowledge learned from language and vision. We introduce an image collection with more than two thousand actions, annotated through crowdsourcing. We review publicly available datasets, describe the annotation process for our image collection, and report some statistics of the dataset. Finally, we present experimental results on the dataset, including human action recognition based on objects and an analysis of the relation between human-object positions in images and prepositions in language.
Related resources
Disambiguating Visual Verbs
In this article, we introduce a new task, visual sense disambiguation for verbs: given an image and a verb, assign the correct sense of the verb, i.e., the one that describes the action depicted in the image. Just as textual word sense disambiguation is useful for a wide range of NLP tasks, visual sense disambiguation can be useful for multimodal tasks such as image retrieval, image description...
Unsupervised Visual Sense Disambiguation for Verbs using Multimodal Embeddings
We introduce a new task, visual sense disambiguation for verbs: given an image and a verb, assign the correct sense of the verb, i.e., the one that describes the action depicted in the image. Just as textual word sense disambiguation is useful for a wide range of NLP tasks, visual sense disambiguation can be useful for multimodal tasks such as image retrieval, image description, and text illust...
Finding Regions of Interest from Multimodal Human-Robot Interactions
Learning new concepts, such as object models, from human-robot interactions entails different recognition capabilities on a robotic platform. This work proposes a hierarchical approach to address the extra challenges from natural interaction scenarios by exploiting multimodal data. First, a speech-guided recognition of the type of interaction happening is presented. This first step facilitates t...
Identification and Characterization of Human Behavior Patterns from Mobile Phone Data
Pavlos Paraskevopoulos (University of Trento, Italy; Telecom Italia SKIL), Thanh-Cong Dinh (University of Trento, Italy; Telecom Italia SKIL), Zolzaya Dashdorj (University of Trento, Italy; Telecom Italia SKIL; Fondazione Bruno Kessler), Themis Palpanas (University of Trento, Italy), Luciano Serafini (Fondazione ...
Deep Q-learning for Active Recognition of GERMS: Baseline performance on a standardized dataset for active learning
In this paper, we introduce GERMS, a dataset designed to accelerate progress on active object recognition in the context of human-robot interaction. GERMS consists of a collection of videos taken from the point of view of a humanoid robot that receives objects from humans and actively examines them. GERMS provides methods to simulate, evaluate, and compare active object recognition approaches t...
Publication date: 2014